Overview

Dataset statistics

Number of variables: 16
Number of observations: 253155
Missing cells: 53018
Missing cells (%): 1.3%
Duplicate rows: 6
Duplicate rows (%): < 0.1%
Total size in memory: 30.9 MiB
Average record size in memory: 128.0 B
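
The derived figures in this table follow from the raw counts. The short sketch below recomputes them from the numbers reported above; the variable names are illustrative and not part of the report.

```python
# Figures copied from the "Dataset statistics" table.
n_variables = 16
n_observations = 253_155
missing_cells = 53_018
duplicate_rows = 6
total_size_mib = 30.9

# A "cell" is one value of one variable in one observation.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells
duplicate_pct = 100 * duplicate_rows / n_observations
# 1 MiB = 1024 * 1024 bytes.
avg_record_bytes = total_size_mib * 1024 * 1024 / n_observations

print(f"total cells:         {total_cells}")
print(f"missing cells:       {missing_pct:.1f}%")
print(f"duplicate rows:      {duplicate_pct:.4f}%")
print(f"average record size: {avg_record_bytes:.1f} B")
```

The average record size of about 128 B is consistent with 16 variables stored at 8 bytes each.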

Variable types

Numeric: 9
Categorical: 7

Alerts

Dataset has 6 (< 0.1%) duplicate rows [Duplicates]
Store_name has a high cardinality: 17738 distinct values [High cardinality]
cate2 has a high cardinality: 74 distinct values [High cardinality]
ID has a high cardinality: 1727 distinct values [High cardinality]
jeju_person_sales is highly correlated with jeju_person_sales_num and 2 other fields [High correlation]
jeju_person_sales_num is highly correlated with jeju_person_sales and 1 other fields [High correlation]
other_person_sales is highly correlated with other_person_sales_num and 3 other fields [High correlation]
other_person_sales_num is highly correlated with other_person_sales and 3 other fields [High correlation]
tot_sales is highly correlated with other_person_sales and 3 other fields [High correlation]
tot_sales_num is highly correlated with jeju_person_sales and 5 other fields [High correlation]
change is highly correlated with jeju_person_sales and 5 other fields [High correlation]
ranking is highly correlated with Date and 1 other fields [High correlation]
si_gune_gu is highly correlated with Dong and 1 other fields [High correlation]
Dong is highly correlated with si_gune_gu and 1 other fields [High correlation]
loc is highly correlated with si_gune_gu and 1 other fields [High correlation]
cate1 is highly correlated with cate2 [High correlation]
cate2 is highly correlated with cate1 [High correlation]
Date is highly correlated with ranking [High correlation]
change has 26509 (10.5%) missing values [Missing]
ranking has 26509 (10.5%) missing values [Missing]
other_person_sales_num is highly skewed (γ1 = 20.90775832) [Skewed]
jeju_person_sales has 12767 (5.0%) zeros [Zeros]
jeju_person_sales_num has 11661 (4.6%) zeros [Zeros]
other_person_sales has 82421 (32.6%) zeros [Zeros]
other_person_sales_num has 27697 (10.9%) zeros [Zeros]
tot_sales has 21173 (8.4%) zeros [Zeros]
change has 10216 (4.0%) zeros [Zeros]

Reproduction

Analysis started: 2022-11-26 01:19:15.254168
Analysis finished: 2022-11-26 01:19:50.387404
Duration: 35.13 seconds
Software version: pandas-profiling v3.4.0
Download configuration: config.json
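
A report of this kind can be regenerated with a few lines of Python. The sketch below is a minimal example, assuming pandas and pandas-profiling (v3.x) are installed. The function name, the CSV path argument, the report title, and the output filename are all hypothetical, since the report does not say how the data was loaded or which options were used.

```python
def build_profile_report(csv_path: str, html_path: str = "report.html") -> str:
    """Profile a CSV file and write an HTML report; returns the output path."""
    # Imported lazily so this module still loads where the third-party
    # packages are not installed.
    import pandas as pd
    from pandas_profiling import ProfileReport

    # Assumption: the source data is a CSV file readable with default settings.
    df = pd.read_csv(csv_path)
    # The title is a placeholder; the original report's settings live in config.json.
    report = ProfileReport(df, title="Dataset profile")
    report.to_file(html_path)
    return html_path
```

Calling `build_profile_report("data.csv")` with a real file path would write `report.html` to the working directory.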

Variables

Date
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 18
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 202144.7132
Minimum: 202101
Maximum: 202207
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.9 MiB

Quantile statistics

Minimum: 202101
5-th percentile: 202101
Q1: 202106
median: 202111
Q3: 202203
95-th percentile: 202207
Maximum: 202207
Range: 106
Interquartile range (IQR): 97

Descriptive statistics

Standard deviation: 47.35496332
Coefficient of variation (CV): 0.0002342626852
Kurtosis: -1.78304966
Mean: 202144.7132
Median Absolute Deviation (MAD): 7
Skewness: 0.449001644
Sum: 5.117394488 × 10^10
Variance: 2242.492551
Monotonicity: Increasing
Histogram with fixed size bins (bins=18)
Most frequent values
Value  Count  Frequency (%)
202111  14585  5.8%
202110  14514  5.7%
202112  14514  5.7%
202201  14332  5.7%
202107  14272  5.6%
202203  14226  5.6%
202204  14189  5.6%
202108  14122  5.6%
202109  14117  5.6%
202202  14117  5.6%
Other values (8)  110167  43.5%

Smallest values
Value  Count  Frequency (%)
202101  12757  5.0%
202103  13733  5.4%
202104  13941  5.5%
202105  14103  5.6%
202106  14055  5.6%
202107  14272  5.6%
202108  14122  5.6%
202109  14117  5.6%
202110  14514  5.7%
202111  14585  5.8%

Largest values
Value  Count  Frequency (%)
202207  13581  5.4%
202206  13883  5.5%
202205  14114  5.6%
202204  14189  5.6%
202203  14226  5.6%
202202  14117  5.6%
202201  14332  5.7%
202112  14514  5.7%
202111  14585  5.8%
202110  14514  5.7%
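
Date is stored as a YYYYMM integer, so statistics such as Range (106) and IQR (97) measure integer distance rather than elapsed months. The sketch below converts the 18 distinct values listed in the tables above to a month index and checks the calendar span. The helper names are illustrative and not part of the report.

```python
# The 18 distinct Date values that appear in the frequency tables (YYYYMM).
observed = [202101, *range(202103, 202113), *range(202201, 202208)]

def month_index(yyyymm: int) -> int:
    """Months since year 0, so that consecutive months differ by exactly 1."""
    year, month = divmod(yyyymm, 100)
    return year * 12 + (month - 1)

def from_month_index(idx: int) -> int:
    """Inverse of month_index: back to a YYYYMM integer."""
    year, month0 = divmod(idx, 12)
    return year * 100 + month0 + 1

first, last = min(observed), max(observed)
# Calendar months covered, counting both endpoints.
span = month_index(last) - month_index(first) + 1
expected = [from_month_index(i)
            for i in range(month_index(first), month_index(last) + 1)]
missing = sorted(set(expected) - set(observed))

print("integer range:", last - first)
print("calendar months in span:", span)
print("months with no rows:", missing)
```

On these values the span covers 19 calendar months while only 18 are present, which matches the tables: 202101 is followed directly by 202103.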

Store_name
Categorical

HIGH CARDINALITY

Distinct: 17738
Distinct (%): 7.0%
Missing: 0
Missing (%): 0.0%
Memory size: 1.9 MiB

Value  Count
으뜸아이스크림할인점  57
한라식당  54
30년할매닭발  54
고사리식당  53
봉봉  40
Other values (17733)  252897

Length

Max length: 38
Median length: 30
Mean length: 5.990938358
Min length: 1

Characters and Unicode

Total characters: 1516636
Distinct characters: 1236
Distinct categories: 10
Distinct scripts: 4
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 429
Unique (%): 0.2%

Sample

1st row: 한국맥도날드(유)제주노형점
2nd row: 버거킹제주이마트점
3rd row: 투썸플레이스 제주노형오거리점
4th row: 뚜레쥬르 노형오거리점
5th row: 에이바우트커피아이파크점

Common Values

Value  Count  Frequency (%)
으뜸아이스크림할인점  57  < 0.1%
한라식당  54  < 0.1%
30년할매닭발  54  < 0.1%
고사리식당  53  < 0.1%
봉봉  40  < 0.1%
가람  39  < 0.1%
통큰코다리  37  < 0.1%
봄봄  37  < 0.1%
왕천파닭  37  < 0.1%
쭈꾸쭈꾸쭈꾸미  36  < 0.1%
Other values (17728)  252711  99.8%

Length

Histogram of lengths of the category
Value  Count  Frequency (%)
주식회사  2889  1.0%
노형점  601  0.2%
서귀포점  538  0.2%
제주  412  0.1%
연동점  408  0.1%
제주점  404  0.1%
중문점  380  0.1%
삼화점  365  0.1%
아라점  364  0.1%
신제주점  364  0.1%
Other values (18033)  276534  97.6%

Most occurring characters

ValueCountFrequency (%)
55973
 
3.7%
36073
 
2.4%
34891
 
2.3%
31261
 
2.1%
30861
 
2.0%
23383
 
1.5%
21204
 
1.4%
19757
 
1.3%
18178
 
1.2%
17420
 
1.1%
Other values (1226)1227635
80.9%

Most occurring categories

Value  Count  Frequency (%)
Other Letter  1432016  94.4%
Space Separator  31261  2.1%
Decimal Number  18373  1.2%
Uppercase Letter  10686  0.7%
Lowercase Letter  10554  0.7%
Close Punctuation  5906  0.4%
Open Punctuation  5903  0.4%
Other Punctuation  1754  0.1%
Dash Punctuation  149  < 0.1%
Connector Punctuation  34  < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
55973
 
3.9%
36073
 
2.5%
34891
 
2.4%
30861
 
2.2%
23383
 
1.6%
21204
 
1.5%
19757
 
1.4%
18178
 
1.3%
17420
 
1.2%
15722
 
1.1%
Other values (1150)1158554
80.9%
Lowercase Letter
ValueCountFrequency (%)
e1387
13.1%
o1319
12.5%
a1130
 
10.7%
r621
 
5.9%
i615
 
5.8%
t522
 
4.9%
f470
 
4.5%
n447
 
4.2%
s444
 
4.2%
u412
 
3.9%
Other values (16)3187
30.2%
Uppercase Letter
ValueCountFrequency (%)
A979
 
9.2%
O807
 
7.6%
E738
 
6.9%
T684
 
6.4%
B638
 
6.0%
D552
 
5.2%
N547
 
5.1%
S513
 
4.8%
L495
 
4.6%
C494
 
4.6%
Other values (16)4239
39.7%
Decimal Number
ValueCountFrequency (%)
23521
19.2%
13360
18.3%
02618
14.2%
31970
10.7%
91380
 
7.5%
41337
 
7.3%
71239
 
6.7%
51143
 
6.2%
81084
 
5.9%
6721
 
3.9%
Other Punctuation
ValueCountFrequency (%)
&662
37.7%
.661
37.7%
,173
 
9.9%
?84
 
4.8%
/71
 
4.0%
'63
 
3.6%
:18
 
1.0%
!11
 
0.6%
;11
 
0.6%
Space Separator
ValueCountFrequency (%)
31261
100.0%
Close Punctuation
ValueCountFrequency (%)
)5906
100.0%
Open Punctuation
ValueCountFrequency (%)
(5903
100.0%
Dash Punctuation
ValueCountFrequency (%)
-149
100.0%
Connector Punctuation
ValueCountFrequency (%)
_34
100.0%

Most occurring scripts

Value  Count  Frequency (%)
Hangul  1431917  94.4%
Common  63380  4.2%
Latin  21240  1.4%
Han  99  < 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
55973
 
3.9%
36073
 
2.5%
34891
 
2.4%
30861
 
2.2%
23383
 
1.6%
21204
 
1.5%
19757
 
1.4%
18178
 
1.3%
17420
 
1.2%
15722
 
1.1%
Other values (1145)1158455
80.9%
Latin
ValueCountFrequency (%)
e1387
 
6.5%
o1319
 
6.2%
a1130
 
5.3%
A979
 
4.6%
O807
 
3.8%
E738
 
3.5%
T684
 
3.2%
B638
 
3.0%
r621
 
2.9%
i615
 
2.9%
Other values (42)12322
58.0%
Common
ValueCountFrequency (%)
31261
49.3%
)5906
 
9.3%
(5903
 
9.3%
23521
 
5.6%
13360
 
5.3%
02618
 
4.1%
31970
 
3.1%
91380
 
2.2%
41337
 
2.1%
71239
 
2.0%
Other values (14)4885
 
7.7%
Han
ValueCountFrequency (%)
36
36.4%
18
18.2%
15
15.2%
15
15.2%
15
15.2%

Most occurring blocks

Value  Count  Frequency (%)
Hangul  1431917  94.4%
ASCII  84620  5.6%
CJK  99  < 0.1%

Most frequent character per block

Hangul
ValueCountFrequency (%)
55973
 
3.9%
36073
 
2.5%
34891
 
2.4%
30861
 
2.2%
23383
 
1.6%
21204
 
1.5%
19757
 
1.4%
18178
 
1.3%
17420
 
1.2%
15722
 
1.1%
Other values (1145)1158455
80.9%
ASCII
ValueCountFrequency (%)
31261
36.9%
)5906
 
7.0%
(5903
 
7.0%
23521
 
4.2%
13360
 
4.0%
02618
 
3.1%
31970
 
2.3%
e1387
 
1.6%
91380
 
1.6%
41337
 
1.6%
Other values (66)25977
30.7%
CJK
ValueCountFrequency (%)
36
36.4%
18
18.2%
15
15.2%
15
15.2%
15
15.2%

si_gune_gu
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.9 MiB

Value  Count
제주시  175748
서귀포시  77407

Length

Max length: 4
Median length: 3
Mean length: 3.305769193
Min length: 3

Characters and Unicode

Total characters: 836872
Distinct characters: 6
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 제주시
2nd row: 제주시
3rd row: 제주시
4th row: 제주시
5th row: 제주시

Common Values

Value  Count  Frequency (%)
제주시  175748  69.4%
서귀포시  77407  30.6%

Length

Histogram of lengths of the category

Category Frequency Plot

Value  Count  Frequency (%)
제주시  175748  69.4%
서귀포시  77407  30.6%

Most occurring characters

ValueCountFrequency (%)
253155
30.3%
175748
21.0%
175748
21.0%
77407
 
9.2%
77407
 
9.2%
77407
 
9.2%

Most occurring categories

Value  Count  Frequency (%)
Other Letter  836872  100.0%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
253155
30.3%
175748
21.0%
175748
21.0%
77407
 
9.2%
77407
 
9.2%
77407
 
9.2%

Most occurring scripts

Value  Count  Frequency (%)
Hangul  836872  100.0%

Most frequent character per script

Hangul
ValueCountFrequency (%)
253155
30.3%
175748
21.0%
175748
21.0%
77407
 
9.2%
77407
 
9.2%
77407
 
9.2%

Most occurring blocks

Value  Count  Frequency (%)
Hangul  836872  100.0%

Most frequent character per block

Hangul
ValueCountFrequency (%)
253155
30.3%
175748
21.0%
175748
21.0%
77407
 
9.2%
77407
 
9.2%
77407
 
9.2%

Dong
Categorical

HIGH CORRELATION

Distinct: 43
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.9 MiB

Value  Count
이도2동  21668
연동  20889
노형동  18107
애월읍  13191
한림읍  10892
Other values (38)  168408

Length

Max length: 4
Median length: 3
Mean length: 3.122889139
Min length: 2

Characters and Unicode

Total characters: 790575
Distinct characters: 63
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 노형동
2nd row: 노형동
3rd row: 노형동
4th row: 노형동
5th row: 노형동

Common Values

Value  Count  Frequency (%)
이도2동  21668  8.6%
연동  20889  8.3%
노형동  18107  7.2%
애월읍  13191  5.2%
한림읍  10892  4.3%
구좌읍  10158  4.0%
조천읍  10150  4.0%
성산읍  9246  3.7%
아라동  9045  3.6%
일도2동  8555  3.4%
Other values (33)  121254  47.9%

Length

Histogram of lengths of the category
Value  Count  Frequency (%)
이도2동  21668  8.6%
연동  20889  8.3%
노형동  18107  7.2%
애월읍  13191  5.2%
한림읍  10892  4.3%
구좌읍  10158  4.0%
조천읍  10150  4.0%
성산읍  9246  3.7%
아라동  9045  3.6%
일도2동  8555  3.4%
Other values (33)  121254  47.9%

Most occurring characters

ValueCountFrequency (%)
172113
21.8%
67248
 
8.5%
54273
 
6.9%
238717
 
4.9%
26516
 
3.4%
20889
 
2.6%
19355
 
2.4%
18810
 
2.4%
18107
 
2.3%
18107
 
2.3%
Other values (53)336440
42.6%

Most occurring categories

Value  Count  Frequency (%)
Other Letter  738576  93.4%
Decimal Number  51999  6.6%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
172113
23.3%
67248
 
9.1%
54273
 
7.3%
26516
 
3.6%
20889
 
2.8%
19355
 
2.6%
18810
 
2.5%
18107
 
2.5%
18107
 
2.5%
17060
 
2.3%
Other values (51)306098
41.4%
Decimal Number
ValueCountFrequency (%)
238717
74.5%
113282
 
25.5%

Most occurring scripts

Value  Count  Frequency (%)
Hangul  738576  93.4%
Common  51999  6.6%

Most frequent character per script

Hangul
ValueCountFrequency (%)
172113
23.3%
67248
 
9.1%
54273
 
7.3%
26516
 
3.6%
20889
 
2.8%
19355
 
2.6%
18810
 
2.5%
18107
 
2.5%
18107
 
2.5%
17060
 
2.3%
Other values (51)306098
41.4%
Common
ValueCountFrequency (%)
238717
74.5%
113282
 
25.5%

Most occurring blocks

Value  Count  Frequency (%)
Hangul  738576  93.4%
ASCII  51999  6.6%

Most frequent character per block

Hangul
ValueCountFrequency (%)
172113
23.3%
67248
 
9.1%
54273
 
7.3%
26516
 
3.6%
20889
 
2.8%
19355
 
2.6%
18810
 
2.5%
18107
 
2.5%
18107
 
2.5%
17060
 
2.3%
Other values (51)306098
41.4%
ASCII
ValueCountFrequency (%)
238717
74.5%
113282
 
25.5%

loc
Categorical

HIGH CORRELATION

Distinct: 16
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.9 MiB

Value  Count
제주시 동지역  124484
서귀포시 동지역  41458
애월읍  13191
한림읍  10892
구좌읍  10158
Other values (11)  52972

Length

Max length: 8
Median length: 7
Mean length: 5.785747862
Min length: 3

Characters and Unicode

Total characters: 1464691
Distinct characters: 39
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 제주시 동지역
2nd row: 제주시 동지역
3rd row: 제주시 동지역
4th row: 제주시 동지역
5th row: 제주시 동지역

Common Values

Value  Count  Frequency (%)
제주시 동지역  124484  49.2%
서귀포시 동지역  41458  16.4%
애월읍  13191  5.2%
한림읍  10892  4.3%
구좌읍  10158  4.0%
조천읍  10150  4.0%
성산읍  9246  3.7%
대정읍  8322  3.3%
안덕면  6974  2.8%
표선면  6118  2.4%
Other values (6)  12162  4.8%

Length

Histogram of lengths of the category
Value  Count  Frequency (%)
동지역  165942  39.6%
제주시  124484  29.7%
서귀포시  41458  9.9%
애월읍  13191  3.1%
한림읍  10892  2.6%
구좌읍  10158  2.4%
조천읍  10150  2.4%
성산읍  9246  2.2%
대정읍  8322  2.0%
안덕면  6974  1.7%
Other values (7)  18280  4.4%

Most occurring characters

ValueCountFrequency (%)
166552
11.4%
165942
11.3%
165942
11.3%
165942
11.3%
165942
11.3%
124484
8.5%
124484
8.5%
67248
 
4.6%
41458
 
2.8%
41458
 
2.8%
Other values (29)235239
16.1%

Most occurring categories

Value  Count  Frequency (%)
Other Letter  1298749  88.7%
Space Separator  165942  11.3%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
166552
12.8%
165942
12.8%
165942
12.8%
165942
12.8%
124484
9.6%
124484
9.6%
67248
 
5.2%
41458
 
3.2%
41458
 
3.2%
41458
 
3.2%
Other values (28)193781
14.9%
Space Separator
ValueCountFrequency (%)
165942
100.0%

Most occurring scripts

Value  Count  Frequency (%)
Hangul  1298749  88.7%
Common  165942  11.3%

Most frequent character per script

Hangul
ValueCountFrequency (%)
166552
12.8%
165942
12.8%
165942
12.8%
165942
12.8%
124484
9.6%
124484
9.6%
67248
 
5.2%
41458
 
3.2%
41458
 
3.2%
41458
 
3.2%
Other values (28)193781
14.9%
Common
ValueCountFrequency (%)
165942
100.0%

Most occurring blocks

Value  Count  Frequency (%)
Hangul  1298749  88.7%
ASCII  165942  11.3%

Most frequent character per block

Hangul
ValueCountFrequency (%)
166552
12.8%
165942
12.8%
165942
12.8%
165942
12.8%
124484
9.6%
124484
9.6%
67248
 
5.2%
41458
 
3.2%
41458
 
3.2%
41458
 
3.2%
Other values (28)193781
14.9%
ASCII
ValueCountFrequency (%)
165942
100.0%

cate1
Categorical

HIGH CORRELATION

Distinct: 9
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.9 MiB

Value  Count
한식  180747
간식  16419
음료  14677
아시아음식  13909
패스트푸드  13776
Other values (4)  13627

Length

Max length: 9
Median length: 2
Mean length: 2.447346487
Min length: 2

Characters and Unicode

Total characters: 619558
Distinct characters: 22
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 패스트푸드
2nd row: 패스트푸드
3rd row: 음료
4th row: 간식
5th row: 음료

Common Values

Value  Count  Frequency (%)
한식  180747  71.4%
간식  16419  6.5%
음료  14677  5.8%
아시아음식  13909  5.5%
패스트푸드  13776  5.4%
양식  7754  3.1%
주점및주류판매  4752  1.9%
주점 및 주류판매  919  0.4%
부페  202  0.1%
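
Two of the nine cate1 values, 주점및주류판매 and 주점 및 주류판매, differ only in whitespace, so they may be the same category entered two ways. The sketch below shows one way to merge such labels, assuming that stripping all whitespace is acceptable for this column. The counts are copied from the table above; the helper name is illustrative.

```python
from collections import Counter

# Counts copied from the cate1 "Common Values" table.
cate1_counts = {
    "한식": 180747,
    "간식": 16419,
    "음료": 14677,
    "아시아음식": 13909,
    "패스트푸드": 13776,
    "양식": 7754,
    "주점및주류판매": 4752,
    "주점 및 주류판매": 919,
    "부페": 202,
}

def normalize_label(label: str) -> str:
    """Remove every whitespace character so spacing variants collapse."""
    return "".join(label.split())

# Re-aggregate the counts under the normalized labels.
merged = Counter()
for label, count in cate1_counts.items():
    merged[normalize_label(label)] += count

for label, count in merged.most_common():
    print(label, count)
```

After merging, eight categories remain and the row total is unchanged at 253155.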

Length

Histogram of lengths of the category

Category Frequency Plot

Value  Count  Frequency (%)
한식  180747  70.9%
간식  16419  6.4%
음료  14677  5.8%
아시아음식  13909  5.5%
패스트푸드  13776  5.4%
양식  7754  3.0%
주점및주류판매  4752  1.9%
주점  919  0.4%
및  919  0.4%
주류판매  919  0.4%

Most occurring characters

ValueCountFrequency (%)
218829
35.3%
180747
29.2%
28586
 
4.6%
27818
 
4.5%
16419
 
2.7%
14677
 
2.4%
13909
 
2.2%
13776
 
2.2%
13776
 
2.2%
13776
 
2.2%
Other values (12)77245
 
12.5%

Most occurring categories

Value  Count  Frequency (%)
Other Letter  617720  99.7%
Space Separator  1838  0.3%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
218829
35.4%
180747
29.3%
28586
 
4.6%
27818
 
4.5%
16419
 
2.7%
14677
 
2.4%
13909
 
2.3%
13776
 
2.2%
13776
 
2.2%
13776
 
2.2%
Other values (11)75407
 
12.2%
Space Separator
ValueCountFrequency (%)
1838
100.0%

Most occurring scripts

Value  Count  Frequency (%)
Hangul  617720  99.7%
Common  1838  0.3%

Most frequent character per script

Hangul
ValueCountFrequency (%)
218829
35.4%
180747
29.3%
28586
 
4.6%
27818
 
4.5%
16419
 
2.7%
14677
 
2.4%
13909
 
2.3%
13776
 
2.2%
13776
 
2.2%
13776
 
2.2%
Other values (11)75407
 
12.2%
Common
ValueCountFrequency (%)
1838
100.0%

Most occurring blocks

Value  Count  Frequency (%)
Hangul  617720  99.7%
ASCII  1838  0.3%

Most frequent character per block

Hangul
ValueCountFrequency (%)
218829
35.4%
180747
29.3%
28586
 
4.6%
27818
 
4.5%
16419
 
2.7%
14677
 
2.4%
13909
 
2.3%
13776
 
2.2%
13776
 
2.2%
13776
 
2.2%
Other values (11)75407
 
12.2%
ASCII
ValueCountFrequency (%)
1838
100.0%

cate2
Categorical

HIGH CARDINALITY
HIGH CORRELATION

Distinct: 74
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.9 MiB

Value  Count
가정식  74816
단품요리 전문  73696
커피  17289
치킨  11185
베이커리  7493
Other values (69)  68676

Length

Max length: 9
Median length: 8
Mean length: 4.035432838
Min length: 1

Characters and Unicode

Total characters: 1021590
Distinct characters: 134
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 햄버거
2nd row: 햄버거
3rd row: 커피
4th row: 베이커리
5th row: 커피

Common Values

Value  Count  Frequency (%)
가정식  74816  29.6%
단품요리 전문  73696  29.1%
커피  17289  6.8%
치킨  11185  4.4%
베이커리  7493  3.0%
중식  6938  2.7%
양식  6782  2.7%
일식  6234  2.5%
분식  5960  2.4%
돼지고기  4939  2.0%
Other values (64)  37823  14.9%

Length

Histogram of lengths of the category
Value  Count  Frequency (%)
가정식  74816  22.9%
단품요리  73696  22.5%
전문  73696  22.5%
커피  17289  5.3%
치킨  11185  3.4%
베이커리  7493  2.3%
중식  6938  2.1%
양식  6782  2.1%
일식  6234  1.9%
분식  5960  1.8%
Other values (66)  43035  13.2%

Most occurring characters

ValueCountFrequency (%)
101689
10.0%
84466
 
8.3%
76298
 
7.5%
74816
 
7.3%
74816
 
7.3%
73969
 
7.2%
73696
 
7.2%
73696
 
7.2%
73696
 
7.2%
73696
 
7.2%
Other values (124)240752
23.6%

Most occurring categories

Value  Count  Frequency (%)
Other Letter  940875  92.1%
Space Separator  73969  7.2%
Other Punctuation  6746  0.7%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
101689
10.8%
84466
9.0%
76298
 
8.1%
74816
 
8.0%
74816
 
8.0%
73696
 
7.8%
73696
 
7.8%
73696
 
7.8%
73696
 
7.8%
24782
 
2.6%
Other values (122)209224
22.2%
Space Separator
ValueCountFrequency (%)
73969
100.0%
Other Punctuation
ValueCountFrequency (%)
/6746
100.0%

Most occurring scripts

Value  Count  Frequency (%)
Hangul  940875  92.1%
Common  80715  7.9%

Most frequent character per script

Hangul
ValueCountFrequency (%)
101689
10.8%
84466
9.0%
76298
 
8.1%
74816
 
8.0%
74816
 
8.0%
73696
 
7.8%
73696
 
7.8%
73696
 
7.8%
73696
 
7.8%
24782
 
2.6%
Other values (122)209224
22.2%
Common
ValueCountFrequency (%)
73969
91.6%
/6746
 
8.4%

Most occurring blocks

Value  Count  Frequency (%)
Hangul  940875  92.1%
ASCII  80715  7.9%

Most frequent character per block

Hangul
ValueCountFrequency (%)
101689
10.8%
84466
9.0%
76298
 
8.1%
74816
 
8.0%
74816
 
8.0%
73696
 
7.8%
73696
 
7.8%
73696
 
7.8%
73696
 
7.8%
24782
 
2.6%
Other values (122)209224
22.2%
ASCII
ValueCountFrequency (%)
73969
91.6%
/6746
 
8.4%

jeju_person_sales
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct: 1387
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.6848708104
Minimum: 0
Maximum: 46.16
Zeros: 12767
Zeros (%): 5.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.9 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0.11
median: 0.33
Q3: 0.8
95-th percentile: 2.46
Maximum: 46.16
Range: 46.16
Interquartile range (IQR): 0.69

Descriptive statistics

Standard deviation: 1.224056655
Coefficient of variation (CV): 1.7872811
Kurtosis: 156.8893962
Mean: 0.6848708104
Median Absolute Deviation (MAD): 0.27
Skewness: 8.774687411
Sum: 173378.47
Variance: 1.498314696
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Most frequent values
Value  Count  Frequency (%)
0  12767  5.0%
0.03  5471  2.2%
0.02  5427  2.1%
0.04  5305  2.1%
0.06  5006  2.0%
0.05  4930  1.9%
0.01  4784  1.9%
0.07  4703  1.9%
0.08  4578  1.8%
0.09  4269  1.7%
Other values (1377)  195915  77.4%

Smallest values
Value  Count  Frequency (%)
0  12767  5.0%
0.01  4784  1.9%
0.02  5427  2.1%
0.03  5471  2.2%
0.04  5305  2.1%
0.05  4930  1.9%
0.06  5006  2.0%
0.07  4703  1.9%
0.08  4578  1.8%
0.09  4269  1.7%

Largest values
Value  Count  Frequency (%)
46.16  1  < 0.1%
42.63  1  < 0.1%
42  1  < 0.1%
41.87  1  < 0.1%
41.66  1  < 0.1%
41.56  1  < 0.1%
39.23  1  < 0.1%
38.17  1  < 0.1%
37.99  1  < 0.1%
37.24  1  < 0.1%

jeju_person_sales_num
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct: 1932
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.8912782288
Minimum: 0
Maximum: 102.4
Zeros: 11661
Zeros (%): 4.6%
Negative: 0
Negative (%): 0.0%
Memory size: 1.9 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0.02
Q1: 0.11
median: 0.34
Q3: 0.87
95-th percentile: 3.39
Maximum: 102.4
Range: 102.4
Interquartile range (IQR): 0.76

Descriptive statistics

Standard deviation: 2.316831634
Coefficient of variation (CV): 2.599448252
Kurtosis: 639.4458283
Mean: 0.8912782288
Median Absolute Deviation (MAD): 0.28
Skewness: 18.68913712
Sum: 225631.54
Variance: 5.367708821
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Most frequent values
Value  Count  Frequency (%)
0.03  14703  5.8%
0  11661  4.6%
0.05  7845  3.1%
0.06  6429  2.5%
0.08  6328  2.5%
0.11  5983  2.4%
0.13  5896  2.3%
0.09  4917  1.9%
0.16  4422  1.7%
0.14  4244  1.7%
Other values (1922)  180727  71.4%

Smallest values
Value  Count  Frequency (%)
0  11661  4.6%
0.01  147  0.1%
0.02  1753  0.7%
0.03  14703  5.8%
0.04  111  < 0.1%
0.05  7845  3.1%
0.06  6429  2.5%
0.07  682  0.3%
0.08  6328  2.5%
0.09  4917  1.9%

Largest values
Value  Count  Frequency (%)
102.4  1  < 0.1%
102.28  1  < 0.1%
102.13  1  < 0.1%
102.05  1  < 0.1%
102.04  1  < 0.1%
102  2  < 0.1%
101.85  1  < 0.1%
101.76  1  < 0.1%
101.74  1  < 0.1%
101.73  1  < 0.1%

other_person_sales
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct: 427
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.06022642255
Minimum: 0
Maximum: 12.07
Zeros: 82421
Zeros (%): 32.6%
Negative: 0
Negative (%): 0.0%
Memory size: 1.9 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0.01
Q3: 0.04
95-th percentile: 0.26
Maximum: 12.07
Range: 12.07
Interquartile range (IQR): 0.04

Descriptive statistics

Standard deviation: 0.2055094313
Coefficient of variation (CV): 3.412280235
Kurtosis: 532.6483432
Mean: 0.06022642255
Median Absolute Deviation (MAD): 0.01
Skewness: 16.52111212
Sum: 15246.62
Variance: 0.04223412635
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Most frequent values
Value  Count  Frequency (%)
0  82421  32.6%
0.01  53975  21.3%
0.02  27091  10.7%
0.03  16825  6.6%
0.04  11402  4.5%
0.05  8327  3.3%
0.06  6310  2.5%
0.07  5057  2.0%
0.08  4101  1.6%
0.09  3391  1.3%
Other values (417)  34255  13.5%

Smallest values
Value  Count  Frequency (%)
0  82421  32.6%
0.01  53975  21.3%
0.02  27091  10.7%
0.03  16825  6.6%
0.04  11402  4.5%
0.05  8327  3.3%
0.06  6310  2.5%
0.07  5057  2.0%
0.08  4101  1.6%
0.09  3391  1.3%

Largest values
Value  Count  Frequency (%)
12.07  1  < 0.1%
11.79  1  < 0.1%
11.47  1  < 0.1%
10.99  1  < 0.1%
10.94  1  < 0.1%
10.81  1  < 0.1%
9.96  1  < 0.1%
9.9  1  < 0.1%
9.86  1  < 0.1%
9.7  1  < 0.1%

other_person_sales_num
Real number (ℝ≥0)

HIGH CORRELATION
SKEWED
ZEROS

Distinct: 1568
Distinct (%): 0.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.3886901701
Minimum: 0
Maximum: 100
Zeros: 27697
Zeros (%): 10.9%
Negative: 0
Negative (%): 0.0%
Memory size: 1.9 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0.02
median: 0.09
Q3: 0.3
95-th percentile: 1.48
Maximum: 100
Range: 100
Interquartile range (IQR): 0.28

Descriptive statistics

Standard deviation: 1.498642015
Coefficient of variation (CV): 3.855621084
Kurtosis: 756.8612729
Mean: 0.3886901701
Median Absolute Deviation (MAD): 0.08
Skewness: 20.90775832
Sum: 98398.86
Variance: 2.245927888
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Most frequent values
Value  Count  Frequency (%)
0  27697  10.9%
0.01  22308  8.8%
0.02  15949  6.3%
0.03  13980  5.5%
0.04  13738  5.4%
0.05  9217  3.6%
0.07  7528  3.0%
0.08  7426  2.9%
0.06  7062  2.8%
0.09  6153  2.4%
Other values (1558)  122097  48.2%

Smallest values
Value  Count  Frequency (%)
0  27697  10.9%
0.01  22308  8.8%
0.02  15949  6.3%
0.03  13980  5.5%
0.04  13738  5.4%
0.05  9217  3.6%
0.06  7062  2.8%
0.07  7528  3.0%
0.08  7426  2.9%
0.09  6153  2.4%

Largest values
Value  Count  Frequency (%)
100  3  < 0.1%
90.9  1  < 0.1%
87.29  1  < 0.1%
75.01  1  < 0.1%
70.46  1  < 0.1%
68.88  1  < 0.1%
66.15  1  < 0.1%
65.64  1  < 0.1%
65.19  1  < 0.1%
62.84  1  < 0.1%
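
The Skewed alert on this variable refers to the skewness coefficient γ1. The sketch below computes the population (biased) form, γ1 = m3 / m2^1.5, and shows how a log1p transform pulls in a long right tail. The sample is hypothetical, chosen only to resemble the shape of this column (many small values and one extreme). The report's own estimator may apply a bias correction, so its values need not match this formula exactly. log1p is one common option for right-skewed, zero-heavy data, not something the report prescribes.

```python
import math

def skewness(values):
    """Population (biased) Fisher-Pearson skewness: m3 / m2**1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    return m3 / m2 ** 1.5

# Hypothetical sample: mostly small values plus one extreme observation.
sample = [0, 0, 0.01, 0.02, 0.04, 0.09, 0.3, 1.48, 100]

raw_skew = skewness(sample)
# log1p(x) = log(1 + x) is defined at zero, which matters for a column
# where about one row in nine is exactly 0.
log_skew = skewness([math.log1p(v) for v in sample])

print(f"raw skewness:   {raw_skew:.2f}")
print(f"log1p skewness: {log_skew:.2f}")
```

With only nine points the reduction is modest, because a single outlier bounds how large the coefficient can get; on the full column the effect would have to be measured rather than assumed.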

tot_sales
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct: 437
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.1000497324
Minimum: 0
Maximum: 12.03
Zeros: 21173
Zeros (%): 8.4%
Negative: 0
Negative (%): 0.0%
Memory size: 1.9 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0.02
median: 0.04
Q3: 0.1
95-th percentile: 0.36
Maximum: 12.03
Range: 12.03
Interquartile range (IQR): 0.08

Descriptive statistics

Standard deviation: 0.2259983061
Coefficient of variation (CV): 2.258859676
Kurtosis: 377.6456646
Mean: 0.1000497324
Median Absolute Deviation (MAD): 0.03
Skewness: 13.48981139
Sum: 25328.09
Variance: 0.05107523434
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Most frequent values
Value  Count  Frequency (%)
0.01  37104  14.7%
0.02  29307  11.6%
0.03  23172  9.2%
0  21173  8.4%
0.04  18815  7.4%
0.05  15443  6.1%
0.06  12720  5.0%
0.07  10463  4.1%
0.08  8999  3.6%
0.09  7649  3.0%
Other values (427)  68310  27.0%

Smallest values
Value  Count  Frequency (%)
0  21173  8.4%
0.01  37104  14.7%
0.02  29307  11.6%
0.03  23172  9.2%
0.04  18815  7.4%
0.05  15443  6.1%
0.06  12720  5.0%
0.07  10463  4.1%
0.08  8999  3.6%
0.09  7649  3.0%

Largest values
Value  Count  Frequency (%)
12.03  1  < 0.1%
11.72  1  < 0.1%
11.54  1  < 0.1%
11.01  1  < 0.1%
10.96  1  < 0.1%
10.8  1  < 0.1%
10.05  1  < 0.1%
10.01  1  < 0.1%
9.99  1  < 0.1%
9.8  1  < 0.1%

tot_sales_num
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 1809
Distinct (%): 0.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.6426482195
Minimum: 0
Maximum: 102.23
Zeros: 496
Zeros (%): 0.2%
Negative: 0
Negative (%): 0.0%
Memory size: 1.9 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0.02
Q1: 0.09
median: 0.24
Q3: 0.6
95-th percentile: 2.34
Maximum: 102.23
Range: 102.23
Interquartile range (IQR): 0.51

Descriptive statistics

Standard deviation: 1.796624623
Coefficient of variation (CV): 2.795657979
Kurtosis: 498.726249
Mean: 0.6426482195
Median Absolute Deviation (MAD): 0.18
Skewness: 16.73239402
Sum: 162689.61
Variance: 3.227860035
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Most frequent values
Value  Count  Frequency (%)
0.01  10143  4.0%
0.03  8182  3.2%
0.02  8031  3.2%
0.07  7156  2.8%
0.04  6997  2.8%
0.05  6871  2.7%
0.09  6046  2.4%
0.06  5828  2.3%
0.08  5461  2.2%
0.12  5431  2.1%
Other values (1799)  183009  72.3%

Smallest values
Value  Count  Frequency (%)
0  496  0.2%
0.01  10143  4.0%
0.02  8031  3.2%
0.03  8182  3.2%
0.04  6997  2.8%
0.05  6871  2.7%
0.06  5828  2.3%
0.07  7156  2.8%
0.08  5461  2.2%
0.09  6046  2.4%

Largest values
Value  Count  Frequency (%)
102.23  1  < 0.1%
100  1  < 0.1%
99.15  1  < 0.1%
94.59  1  < 0.1%
93.3  1  < 0.1%
90.95  1  < 0.1%
81.7  1  < 0.1%
77  1  < 0.1%
73.39  1  < 0.1%
71.03  1  < 0.1%

change
Real number (ℝ)

HIGH CORRELATION
MISSING
ZEROS

Distinct: 1741
Distinct (%): 0.8%
Missing: 26509
Missing (%): 10.5%
Infinite: 0
Infinite (%): 0.0%
Mean: -0.08994246534
Minimum: -62.92
Maximum: 62.27
Zeros: 10216
Zeros (%): 4.0%
Negative: 128371
Negative (%): 50.7%
Memory size: 1.9 MiB

Quantile statistics

Minimum: -62.92
5-th percentile: -0.81
Q1: -0.14
median: -0.02
Q3: 0.05
95-th percentile: 0.45
Maximum: 62.27
Range: 125.19
Interquartile range (IQR): 0.19

Descriptive statistics

Standard deviation: 0.9898825882
Coefficient of variation (CV): -11.0057311
Kurtosis: 599.9457333
Mean: -0.08994246534
Median Absolute Deviation (MAD): 0.09
Skewness: -5.484058496
Sum: -20385.1
Variance: 0.9798675384
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Most frequent values
Value  Count  Frequency (%)
-0.01  10740  4.2%
0  10216  4.0%
0.01  9650  3.8%
-0.02  9011  3.6%
-0.03  7427  2.9%
0.02  7423  2.9%
-0.04  6514  2.6%
0.03  6261  2.5%
-0.05  5740  2.3%
0.04  5465  2.2%
Other values (1731)  148199  58.5%
(Missing)  26509  10.5%

Smallest values
Value  Count  Frequency (%)
-62.92  1  < 0.1%
-54.6  1  < 0.1%
-47.15  1  < 0.1%
-47.02  1  < 0.1%
-43.26  1  < 0.1%
-40.62  1  < 0.1%
-40.27  1  < 0.1%
-39.57  1  < 0.1%
-38.32  1  < 0.1%
-36.87  1  < 0.1%

Largest values
Value  Count  Frequency (%)
62.27  1  < 0.1%
50.87  1  < 0.1%
47.35  1  < 0.1%
37.8  1  < 0.1%
35.7  1  < 0.1%
34.56  1  < 0.1%
33.57  1  < 0.1%
32.71  1  < 0.1%
29.75  1  < 0.1%
29.29  1  < 0.1%
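
The descriptive statistics for change are linked by simple identities, which also explain the large negative coefficient of variation: CV is the standard deviation divided by the mean, and the mean here is small and negative. The sketch below recomputes CV, variance, and sum from the reported figures, assuming the mean and sum are taken over non-missing rows only. The variable names are illustrative.

```python
# Figures copied from the overview and the statistics for `change`.
n_observations = 253_155
n_missing = 26_509
mean = -0.08994246534
std = 0.9898825882

# Assumption: missing rows are excluded from the mean and the sum.
n_valid = n_observations - n_missing

cv = std / mean          # negative because the mean is negative
variance = std ** 2
total = mean * n_valid   # sum of all non-missing values

print(f"non-missing rows: {n_valid}")
print(f"CV:               {cv:.7f}")
print(f"variance:         {variance:.10f}")
print(f"sum:              {total:.1f}")
```

Because the mean is close to zero relative to the spread, the CV is a poor summary for this variable; the standard deviation or MAD is easier to interpret.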

ranking
Real number (ℝ≥0)

HIGH CORRELATION
MISSING

Distinct: 18047
Distinct (%): 8.0%
Missing: 26509
Missing (%): 10.5%
Infinite: 0
Infinite (%): 0.0%
Mean: 7484.957849
Minimum: 0.01
Maximum: 17450
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.9 MiB

Quantile statistics

Minimum: 0.01
5-th percentile: 0.67
Q1: 3001
median: 7588
Q3: 11578
95-th percentile: 15465.75
Maximum: 17450
Range: 17449.99
Interquartile range (IQR): 8577

Descriptive statistics

Standard deviation: 4918.986577
Coefficient of variation (CV): 0.6571829362
Kurtosis: -1.165867871
Mean: 7484.957849
Median Absolute Deviation (MAD): 4266
Skewness: 0.04967723568
Sum: 1696435757
Variance: 24196428.94
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Most frequent values
Value  Count  Frequency (%)
0.11  569  0.2%
0.01  494  0.2%
0.14  485  0.2%
0.02  404  0.2%
0.03  383  0.2%
0.05  349  0.1%
0.06  342  0.1%
0.04  337  0.1%
0.07  329  0.1%
0.1  321  0.1%
Other values (18037)  222633  87.9%
(Missing)  26509  10.5%

Smallest values
Value  Count  Frequency (%)
0.01  494  0.2%
0.02  404  0.2%
0.03  383  0.2%
0.04  337  0.1%
0.05  349  0.1%
0.06  342  0.1%
0.07  329  0.1%
0.08  308  0.1%
0.09  260  0.1%
0.1  321  0.1%

Largest values
Value  Count  Frequency (%)
17450  2  < 0.1%
17449  2  < 0.1%
17448  2  < 0.1%
17447  2  < 0.1%
17446  2  < 0.1%
17445  2  < 0.1%
17444  2  < 0.1%
17443  2  < 0.1%
17442  2  < 0.1%
17441  2  < 0.1%

ID
Categorical

HIGH CARDINALITY

Distinct: 1727
Distinct (%): 0.7%
Missing: 0
Missing (%): 0.0%
Memory size: 1.9 MiB

Value  Count
다다09b01a  5097
다나06a99b  4224
다나12b73a  3864
다다06a00a  3640
다나12b73b  3610
Other values (1722)  232720

Length

Max length: 8
Median length: 8
Mean length: 7.993695562
Min length: 1

Characters and Unicode

Total characters: 2023644
Distinct characters: 14
Distinct categories: 3
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 21
Unique (%): < 0.1%

Sample

1st row: 다나05a99b
2nd row: 다나05a99b
3rd row: 다나05a99b
4th row: 다나05a99b
5th row: 다나05a99b

Common Values

Value | Count | Frequency (%)
다다09b01a | 5097 | 2.0%
다나06a99b | 4224 | 1.7%
다나12b73a | 3864 | 1.5%
다다06a00a | 3640 | 1.4%
다나12b73b | 3610 | 1.4%
다다05a00a | 3457 | 1.4%
다나05b99b | 3018 | 1.2%
다다09b02b | 2710 | 1.1%
다다11a01b | 2633 | 1.0%
다다06b00a | 2566 | 1.0%
Other values (1717) | 218336 | 86.2%

Length

Histogram of lengths of the category

Most occurring characters

Value | Count | Frequency (%)
 | 314667 | 15.5%
0 | 280139 | 13.8%
b | 256072 | 12.7%
a | 249782 | 12.3%
 | 191187 | 9.4%
9 | 151969 | 7.5%
1 | 124448 | 6.1%
7 | 97535 | 4.8%
8 | 75547 | 3.7%
3 | 74220 | 3.7%
Other values (4) | 208078 | 10.3%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 1011936 | 50.0%
Other Letter | 505854 | 25.0%
Lowercase Letter | 505854 | 25.0%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 280139 | 27.7%
9 | 151969 | 15.0%
1 | 124448 | 12.3%
7 | 97535 | 9.6%
8 | 75547 | 7.5%
3 | 74220 | 7.3%
2 | 69697 | 6.9%
4 | 53634 | 5.3%
5 | 48724 | 4.8%
6 | 36023 | 3.6%

Other Letter
Value | Count | Frequency (%)
 | 314667 | 62.2%
 | 191187 | 37.8%

Lowercase Letter
Value | Count | Frequency (%)
b | 256072 | 50.6%
a | 249782 | 49.4%

Most occurring scripts

Value | Count | Frequency (%)
Common | 1011936 | 50.0%
Hangul | 505854 | 25.0%
Latin | 505854 | 25.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 280139 | 27.7%
9 | 151969 | 15.0%
1 | 124448 | 12.3%
7 | 97535 | 9.6%
8 | 75547 | 7.5%
3 | 74220 | 7.3%
2 | 69697 | 6.9%
4 | 53634 | 5.3%
5 | 48724 | 4.8%
6 | 36023 | 3.6%

Hangul
Value | Count | Frequency (%)
 | 314667 | 62.2%
 | 191187 | 37.8%

Latin
Value | Count | Frequency (%)
b | 256072 | 50.6%
a | 249782 | 49.4%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 1517790 | 75.0%
Hangul | 505854 | 25.0%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
 | 314667 | 62.2%
 | 191187 | 37.8%

ASCII
Value | Count | Frequency (%)
0 | 280139 | 18.5%
b | 256072 | 16.9%
a | 249782 | 16.5%
9 | 151969 | 10.0%
1 | 124448 | 8.2%
7 | 97535 | 6.4%
8 | 75547 | 5.0%
3 | 74220 | 4.9%
2 | 69697 | 4.6%
4 | 53634 | 3.5%
Other values (2) | 84747 | 5.6%

Interactions

[Figures: pairwise interaction plots of the numeric variables (81 panels)]

Correlations


Auto

The auto setting is an easily interpretable pairwise column metric that picks the method from the types of the two variables: categorical-categorical uses Cramér's V, numerical-categorical uses Cramér's V (with the numerical column discretized), and numerical-numerical uses Spearman's ρ. This configuration applies the most suitable method to each pair of columns.

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
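
The calculation described above can be sketched in plain Python. This is a minimal illustration, not the code the report itself runs; the helper names `average_ranks` and `spearman_rho` are assumptions made for the example, and ties are given the mean of their rank positions.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2 + 1
        for pos in range(start, end + 1):
            ranks[order[pos]] = shared_rank
        start = end + 1
    return ranks


def spearman_rho(x, y):
    """Covariance of the rank variables over the product of their standard deviations."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    # The 1/n factors of the covariance and the variances cancel, so they are omitted.
    return cov / math.sqrt(var_x * var_y)


x = [1, 2, 3, 4, 5]
y = [1, 8, 27, 64, 125]  # y = x**3: monotonic but not linear
rho = spearman_rho(x, y)
print(rho)
```

Because only the ordering of the values matters, the cubic relationship in the toy data yields a ρ of 1 (up to floating-point rounding) even though the relationship is not linear.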

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
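
As a rough sketch of that definition, the function below computes r directly and then checks the invariance property on toy data. The function name `pearson_r` and the sample values are illustrative assumptions; positive scale factors are used so the sign of r is preserved.

```python
import math


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    # The 1/n factors of the covariance and the variances cancel, so they are omitted.
    return cov / math.sqrt(var_x * var_y)


x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
r = pearson_r(x, y)

# Shift and rescale each variable separately; r should not change.
r_shifted = pearson_r([10 * a + 3 for a in x], [0.5 * b - 7 for b in y])
print(math.isclose(r, r_shifted))  # True
```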

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
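
The pair-counting formula can be written out directly. The sketch below implements exactly the expression in the text (often called τ-a); the function name is an assumption for the example, and tie-corrected variants such as τ-b, which statistical libraries commonly use, give different values when ties are present.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant - discordant) / total pairs, with no correction for ties."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1   # both variables move in the same direction
        elif direction < 0:
            discordant += 1   # the variables move in opposite directions
    n = len(x)
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs


# One swapped pair out of six: 5 concordant, 1 discordant.
tau = kendall_tau_a([1, 2, 3, 4], [1, 3, 2, 4])
print(round(tau, 4))  # 0.6667
```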

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
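
A minimal sketch of a bias-corrected Cramér's V is given below, assuming the usual form of Bergsma's correction: the chi-square based φ² is reduced by (k-1)(r-1)/(n-1), and the numbers of rows r and columns k are shrunk by (r-1)²/(n-1) and (k-1)²/(n-1). The function name and the toy data are illustrative; this is not the report's own implementation.

```python
import math
from collections import Counter


def cramers_v_corrected(a, b):
    """Bias-corrected Cramér's V for two equally long sequences of category labels."""
    n = len(a)
    joint = Counter(zip(a, b))
    rows, cols = Counter(a), Counter(b)

    # Pearson chi-square statistic over the full contingency table.
    chi2 = 0.0
    for i, n_i in rows.items():
        for j, n_j in cols.items():
            expected = n_i * n_j / n
            observed = joint.get((i, j), 0)
            chi2 += (observed - expected) ** 2 / expected

    phi2 = chi2 / n
    r, k = len(rows), len(cols)
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(k_corr - 1, r_corr - 1)
    return math.sqrt(phi2_corr / denom) if denom > 0 else 0.0


# Perfectly associated labels versus independent labels (toy data).
v_perfect = cramers_v_corrected(["x", "y"] * 50, ["p", "q"] * 50)
v_independent = cramers_v_corrected(["x", "x", "y", "y"] * 25, ["p", "q", "p", "q"] * 25)
print(v_perfect, v_independent)
```

On the toy data the perfectly associated pair gives a value of 1 (up to rounding) and the independent pair gives 0, matching the range described in the text.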

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available from the phik project.

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion by eye.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
The dendrogram allows you to more fully correlate variable completion, revealing trends deeper than the pairwise ones visible in the correlation heatmap.
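
Nullity correlation, as described for the heatmap, can be approximated by correlating the presence indicators of two columns. The sketch below is a plain-Python illustration under that assumption; the function name and the toy columns are hypothetical, with `None` standing in for a missing cell.

```python
import math


def nullity_correlation(col_a, col_b):
    """Pearson correlation between presence indicators (1 = present, 0 = missing)."""
    a = [0 if v is None else 1 for v in col_a]
    b = [0 if v is None else 1 for v in col_b]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    if var_a == 0 or var_b == 0:
        # A column that is always present (or always missing) carries no nullity signal.
        return float("nan")
    return cov / math.sqrt(var_a * var_b)


# Toy columns for illustration only.
change = [None, None, 0.03, 0.12, -0.28]
ranking = [None, None, 8665.0, 3559.0, 15109.0]
tot_sales = [4.68, 1.63, 0.02, 0.00, 0.09]

same_rows = nullity_correlation(change, ranking)     # missing on the same rows
no_signal = nullity_correlation(change, tot_sales)   # tot_sales is never missing
print(same_rows, no_signal)
```

A value near 1 means the two columns tend to be missing together. That is the pattern one would expect here if `change` and `ranking`, which both report 26509 missing values, are missing on the same rows.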

Sample

First rows

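(index) | Date | Store_name | si_gune_gu | Dong | loc | cate1 | cate2 | jeju_person_sales | jeju_person_sales_num | other_person_sales | other_person_sales_num | tot_sales | tot_sales_num | change | ranking | ID
0 | 202101 | 한국맥도날드(유)제주노형점 | 제주시 | 노형동 | 제주시 동지역 | 패스트푸드 | 햄버거 | 29.36 | 98.18 | 0.90 | 27.71 | 4.68 | 102.23 | NaN | NaN | 다나05a99b
1 | 202101 | 버거킹제주이마트점 | 제주시 | 노형동 | 제주시 동지역 | 패스트푸드 | 햄버거 | 9.60 | 29.89 | 0.39 | 12.12 | 1.63 | 33.69 | NaN | NaN | 다나05a99b
2 | 202101 | 투썸플레이스 제주노형오거리점 | 제주시 | 노형동 | 제주시 동지역 | 음료 | 커피 | 3.49 | 11.14 | 0.14 | 4.31 | 0.59 | 12.40 | NaN | NaN | 다나05a99b
3 | 202101 | 뚜레쥬르 노형오거리점 | 제주시 | 노형동 | 제주시 동지역 | 간식 | 베이커리 | 3.40 | 9.41 | 0.08 | 1.79 | 0.52 | 9.20 | NaN | NaN | 다나05a99b
4 | 202101 | 에이바우트커피아이파크점 | 제주시 | 노형동 | 제주시 동지역 | 음료 | 커피 | 1.37 | 8.91 | 0.04 | 2.10 | 0.21 | 8.98 | NaN | NaN | 다나05a99b
5 | 202101 | 던킨도너츠신제주점 | 제주시 | 노형동 | 제주시 동지역 | 간식 | 도너츠 | 2.66 | 8.44 | 0.04 | 1.60 | 0.38 | 8.24 | NaN | NaN | 다나05a99b
6 | 202101 | 주식회사 파스쿠찌제주노형로터리점 | 제주시 | 노형동 | 제주시 동지역 | 음료 | 커피 | 2.17 | 8.22 | 0.03 | 0.92 | 0.31 | 7.58 | NaN | NaN | 다나05a99b
7 | 202101 | (유)아웃백스테이크하우스코리아제주점 | 제주시 | 노형동 | 제주시 동지역 | 양식 | 패밀리 레스토랑 | 19.13 | 7.97 | 0.67 | 2.59 | 3.14 | 8.53 | NaN | NaN | 다나05a99b
8 | 202101 | 더치앤빈 노형점 | 제주시 | 노형동 | 제주시 동지역 | 음료 | 커피 | 1.29 | 7.43 | 0.04 | 2.14 | 0.21 | 7.77 | NaN | NaN | 다나05a99b
9 | 202101 | 타이거커피 제주점 | 제주시 | 노형동 | 제주시 동지역 | 한식 | 단품요리 전문 | 0.99 | 6.59 | 0.01 | 1.03 | 0.14 | 6.28 | NaN | NaN | 다나05a99b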

Last rows

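(index) | Date | Store_name | si_gune_gu | Dong | loc | cate1 | cate2 | jeju_person_sales | jeju_person_sales_num | other_person_sales | other_person_sales_num | tot_sales | tot_sales_num | change | ranking | ID
253145 | 202207 | 경성주막1929삼화점 | 제주시 | 화북동 | 제주시 동지역 | 한식 | 가정식 | 0.30 | 0.45 | 0.0 | 0.0 | 0.02 | 0.08 | 0.03 | 8665.0 | 다다14a02b
253146 | 202207 | 현아식당 | 제주시 | 화북동 | 제주시 동지역 | 한식 | 가정식 | 0.04 | 0.28 | 0.0 | 0.0 | 0.00 | 0.05 | 0.12 | 3559.0 | 다다13a02b
253147 | 202207 | 서로푸드 | 제주시 | 화북동 | 제주시 동지역 | 한식 | 단품요리 전문 | 0.04 | 0.13 | 0.0 | 0.0 | 0.00 | 0.02 | 0.00 | 11660.0 | 다다14a03a
253148 | 202207 | 투다리한라점 | 제주시 | 화북동 | 제주시 동지역 | 주점및주류판매 | 꼬치구이 | 0.39 | 0.55 | 0.0 | 0.0 | 0.03 | 0.10 | 0.11 | 4091.0 | 다다13a03b
253149 | 202207 | 뱃사공 | 제주시 | 화북동 | 제주시 동지역 | 한식 | 단품요리 전문 | 0.56 | 0.25 | 0.0 | 0.0 | 0.04 | 0.04 | 0.04 | 7528.0 | 다다13a03b
253150 | 202207 | 남문회센타 | 제주시 | 화북동 | 제주시 동지역 | 한식 |  | 0.34 | 0.28 | 0.0 | 0.0 | 0.02 | 0.05 | 0.04 | 7757.0 | 다다13a03a
253151 | 202207 | 황금성반점 | 제주시 | 화북동 | 제주시 동지역 | 아시아음식 | 중식 | 0.56 | 1.36 | 0.0 | 0.0 | 0.04 | 0.24 | 0.03 | 7906.0 | 다다12a02b
253152 | 202207 | 남문두루치기 | 제주시 | 화북동 | 제주시 동지역 | 한식 | 단품요리 전문 | 0.16 | 0.30 | 0.0 | 0.0 | 0.01 | 0.06 | 0.02 | 9099.0 | 다다13a03b
253153 | 202207 | 갈비정식 | 제주시 | 화북동 | 제주시 동지역 | 한식 | 갈비 | 0.18 | 0.23 | 0.0 | 0.0 | 0.01 | 0.04 | 0.05 | 6841.0 | 다다13a03a
253154 | 202207 | 불타는여고24시떡볶이삼화점 | 제주시 | 화북동 | 제주시 동지역 | 간식 | 분식 | 0.18 | 0.43 | 0.0 | 0.0 | 0.01 | 0.07 | 0.05 | 7094.0 | 다다14a03a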

Duplicate rows

Most frequently occurring

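(index) | Date | Store_name | si_gune_gu | Dong | loc | cate1 | cate2 | jeju_person_sales | jeju_person_sales_num | other_person_sales | other_person_sales_num | tot_sales | tot_sales_num | change | ranking | ID | # duplicates
0 | 202110 | 금정아트 민화공방 | 제주시 | 일도1동 | 제주시 동지역 | 간식 | 베이커리 | 0.02 | 0.03 | 0.00 | 0.00 | 0.00 | 0.01 | 0.00 | 4141.0 | 다다09a02b | 2
1 | 202110 | 칼맛 | 서귀포시 | 대륜동 | 서귀포시 동지역 | 아시아음식 | 일식 | 1.09 | 0.38 | 0.03 | 0.08 | 0.09 | 0.17 | -0.28 | 15109.0 | 다나08a74a | 2
2 | 202111 | 1월구름 | 서귀포시 | 안덕면 | 안덕면 | 간식 | 베이커리 | 0.02 | 0.06 | 0.00 | 0.01 | 0.00 | 0.02 | -0.04 | 8195.0 | 나나91a73a | 2
3 | 202111 | 올푸드 | 서귀포시 | 정방동 | 서귀포시 동지역 | 패스트푸드 | 피자 | 0.02 | 0.03 | 0.00 | 0.00 | 0.00 | 0.01 | -0.01 | 6120.0 | 다나13a73a | 2
4 | 202111 | 칼맛 | 서귀포시 | 대륜동 | 서귀포시 동지역 | 아시아음식 | 일식 | 0.40 | 0.10 | 0.00 | 0.01 | 0.03 | 0.04 | -0.02 | 7179.0 | 다나08a74a | 2
5 | 202111 | 패자부활전 | 제주시 | 이도2동 | 제주시 동지역 | 한식 | 단품요리 전문 | 0.08 | 0.09 | 0.00 | 0.01 | 0.01 | 0.04 | -0.05 | 8766.0 | 다다09b01a | 2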
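
A duplicate count like the one in the last column can be reproduced by counting identical rows. The sketch below uses toy rows with only three of the sixteen columns (an illustrative simplification, not the real data) and reports every row that occurs more than once.

```python
from collections import Counter

# Toy rows: (Date, Store_name, ID). Real rows would carry all sixteen columns.
rows = [
    (202110, "칼맛", "다나08a74a"),
    (202110, "칼맛", "다나08a74a"),
    (202111, "칼맛", "다나08a74a"),
    (202111, "올푸드", "다나13a73a"),
]

counts = Counter(rows)
# Keep only rows that appear at least twice, together with their multiplicity.
duplicates = {row: n for row, n in counts.items() if n > 1}
print(duplicates)
```

With pandas, `df[df.duplicated(keep=False)]` selects the rows that have at least one exact copy, which is a common starting point before deciding whether to drop them.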